Search results for "Missing values"
showing 3 items of 3 documents
Estimating with kernel smoothers the mean of functional data in a finite population setting. A note on variance estimation in presence of partially o…
2014
In the near future, millions of load curves measuring the electricity consumption of French households in small time grids (probably half hours) will be available. All these collected load curves represent a huge amount of information which could be exploited using survey sampling techniques. In particular, the total consumption of a specific customer group (for example all the customers of an electricity supplier) could be estimated using unequal probability random sampling methods. Unfortunately, data collection may undergo technical problems resulting in missing values. In this paper we study a new estimation method for the mean curve in the presence of missing values which consists in…
Air quality and integration of short-term and long-term pollutant data
2008
Modelling PM10 is an important problem in statistical methodology, above all to explain the behaviour of PM10 in space and time, since it has been linked to many adverse effects on human and environmental health. But the large spatial variability of the main traffic-related pollutants, and in particular here of PM10, makes it impossible to obtain a complete picture of atmospheric pollution in urban areas from fixed-station data alone. Information from fixed monitoring stations (long-term measurements) is therefore integrated with information from mobile stations (short-term measurements). Short-term measurements are incomplete and so it is necessary to integrate …
Toolbox for Distance Estimation and Cluster Validation on Data With Missing Values
2022
Missing data are unavoidable in the real-world application of unsupervised machine learning, and their nonoptimal processing may decrease the quality of data-driven models. Imputation is a common remedy for missing values, but methods that directly estimate expected distances have also emerged. Because treatment of missing values is rarely considered in clustering-related tasks and distance metrics have a central role both in clustering and cluster validation, we developed a new toolbox that provides a wide range of algorithms for data preprocessing, distance estimation, clustering, and cluster validation in the presence of missing values. All these are core elements in any comprehensive cluster analy…